Abstract: Sentiment analysis refers to the analysis of feelings, emotions, or opinions that people express through social media, blogs, and reviews. It extracts customer reviews from the web and classifies each review as positive or negative using a sentiment classification approach. This paper proposes a new technique for sentiment classification that selects the most important features using different feature weights. First, different data pre-processing techniques are applied to three labeled datasets: polarity movie reviews, Yelp restaurant reviews, and Amazon product reviews. Second, the Information Gain, Uncertainty, and Gini Index methods are used to select the most influential features. Finally, the sentiment classification task is carried out using Rapid Miner, an open source data mining tool. The performance of the Support Vector Machine (SVM) is examined in combination with the different feature selection schemes to obtain the sentiment analysis results. The paper concludes with an investigation of the experimental results, which show the effectiveness of the classifier with Information Gain.
Keywords: Gini Index, Information Gain, Sentiment Analysis, Support Vector Machine, Opinion, Uncertainty
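
As a rough illustration of how two of the feature weights named above can be computed, the following minimal Python sketch scores a term by how much splitting the reviews on its presence reduces label impurity, using entropy (Information Gain) or the Gini index. It is not the Rapid Miner implementation used in this paper. The function names, the toy reviews, and the treatment of each review as a set of terms are assumptions made only for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def term_weight(term, docs, labels, impurity):
    """Impurity reduction obtained by splitting docs on presence of `term`.

    With impurity=entropy this is the information gain of the term;
    with impurity=gini it is the Gini-based weight.
    """
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    # Weighted impurity of the two partitions (empty partitions are skipped).
    remainder = sum(
        len(part) / n * impurity(part) for part in (present, absent) if part
    )
    return impurity(labels) - remainder


# Hypothetical toy data: each review is represented as a set of terms.
docs = [
    {"great", "movie"},
    {"great", "food"},
    {"bad", "movie"},
    {"bad", "service"},
]
labels = ["pos", "pos", "neg", "neg"]

for term in ("great", "movie"):
    ig = term_weight(term, docs, labels, entropy)
    gi = term_weight(term, docs, labels, gini)
    print(f"{term}: information gain={ig:.3f}, gini weight={gi:.3f}")
```

In this toy data, "great" separates the two classes perfectly and receives the highest weight under both measures, while "movie" occurs equally in both classes and receives a weight of zero. A feature selection step would keep the top-weighted terms before training the classifier.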